Efficient measurement of quantum gate error by interleaved randomized benchmarking 
We describe a scalable experimental protocol for obtaining estimates of the error rate of individual
quantum computational gates. This protocol, in which random Clifford gates are interleaved with
a gate of interest, provides a bounded estimate of the average error of the gate under test, so long as
the average variation of the noise affecting the full set of Clifford gates is small. This technique takes
into account both state preparation and measurement errors and is scalable in the number of qubits.
We apply this protocol to a superconducting qubit system and find a gate error of $0.002^{+0.005}_{-0.002}$ for
the single-qubit gates $X_{\pi/2}$ and $Y_{\pi/2}$, which compares favorably with the gate errors extracted via
quantum process tomography.



Determining how well an operation is implemented on 
a quantum device is of fundamental importance in quan- 
tum information theory. Such a characterization allows 
a direct comparison between different architectures for 
computation as well as an understanding of the perfor- 
mance of the building blocks of a quantum computer. 
The standard method for characterizing a quantum oper- 
ation is quantum process tomography (QPT) [1,2] which 
is subject to two significant drawbacks: first, it is not 
scalable in the number of sub-systems (qubits) compris- 
ing the system; and second if state-preparation and mea- 
surement errors (SPME) are present, then these errors 
will contribute to that of the gate being characterized, 
hence giving an unfaithful estimation of the actual error. 
In many cases, one does not require the complete knowl- 
edge that QPT aims to provide. As a result, various 
methods for partially characterizing a quantum opera- 
tion have been developed [ - ]. Ideally such a method 
should be scalable in the number of qubits, n, comprising 
the system as well as provide a faithful measure of the 
noise that is independent of SPME. 

One particular method for partial noise characteriza- 
tion is "randomized benchmarking" (RB) [4, 11, 12], with 
Ref. [ ] providing the first scalable RB protocol that sat- 
isfies all of the above criteria. The general idea of RB is to 
implement random sequences of gates that form an iden- 
tity operation, and measure the fidelity of each sequence. 
Averaging over different realizations results in a fidelity 
decay versus the sequence length, from which the aver- 
age error over the full gate set is estimated via fitting the 
curve to a derived model. The simplicity of this protocol 
has led to various experimental implementations of the
single-qubit gate protocol presented in Ref. [ ], includ- 
ing in atomic ions with different types of traps [ - ] , 
liquid state nuclear spins [ ], superconducting qubits 
[16- ], and atoms in optical lattices [19]. 

The multi-qubit RB protocol described in Ref. [ ] is
restricted to benchmarking only the full Clifford group on
$n$ qubits, $\mathrm{Clif}_n$. While this provides a significant step
towards scalable benchmarking of a quantum information
processor, it is desirable in many cases to benchmark
individual gates in $\mathrm{Clif}_n$ rather than the entire set.
One method for characterizing the fidelity of single Clifford
gates has been provided in Ref. [20], proposing an
extension of the protocol introduced in Ref. [ ]. The
main drawback of this method is that it does not account
for SPME, which can bias estimates of the gate error.
Note that benchmarking Clifford gates rather than
general elements of the unitary group is not a significant
restriction, as any unitary gate can be implemented
with fault-tolerance using special input states, Clifford
elements and computational basis measurements [ ].
Additionally, the unitary group can be generated via $\mathrm{Clif}_n$
through the addition of a single gate not in the group [22].
Thus, benchmarking Clifford elements provides significant
information regarding the reliability of a general quantum
gate, and is the relevant metric for fault-tolerant
thresholds [22-25].

In this Letter, we present a new protocol for bench- 
marking individual Clifford gates via randomization. Our 
protocol consists of interleaving random gates between 
the gate, C, of interest. In the limits of either perfect 
random gates or that the average error of all gates is 
depolarizing, our protocol estimates the gate error of C 
perfectly. In the completely general case where the ran- 
dom gates have arbitrary errors with small average varia- 
tion, we provide explicit bounds for the error of C. These 
bounds give direct information regarding the quality of 
computational gates and thus useful information about 
reaching thresholds for fault-tolerant quantum computation
[22-25]. The method utilizes many of the techniques
of Ref. [ ] and similarly is scalable and independent of
SPME. Finally, we experimentally demonstrate this protocol
on a superconducting qubit, extracting a gate error
of $0.002^{+0.005}_{-0.002}$ for both $X_{\pi/2}$ and $Y_{\pi/2}$ gates ($U_\theta$ is a rotation
of $\theta$ around axis $U$), comparing favorably with gate
errors extracted via QPT (0.010 and 0.007 respectively).

Interleaving benchmarking protocol. — To benchmark 
the Clifford element $C$, which has an associated noise
operator $\Lambda_C$, we fix an initial state $|\psi\rangle$ and perform the
following steps:

FIG. 1. (color online) Randomized benchmarking protocols.
(a)-(b) Schemes for the standard and interleaved benchmarking
protocols. The target gate, $C$ (green), is interleaved with
random gates $C_i$ (orange) chosen from $\mathrm{Clif}_n$. A final gate
$C_{m+1}$ (red) is performed to make the total sequence equal to
the identity operation.

Step 1: Implement standard randomized benchmarking
[see Fig. 1(a)] which, for completeness, we briefly
summarize here (additional details in Ref. [ , ]). For
various values of $m$, choose $K$ sequences of random gates
where the first $m$ gates are chosen uniformly at random
from $\mathrm{Clif}_n$. The $(m+1)$th gate is chosen to be the inverse
of the composition of the first $m$ random gates and can
be found efficiently by the Gottesman-Knill theorem [ ].
Assuming each Clifford element $C_{i_j}$ has some associated
error, represented by $\Lambda_{i_j}$, the sequence of gates is modeled by

$$\mathcal{S}_{\mathbf{i_m}} = \bigcirc_{j=1}^{m+1} \left( \Lambda_{i_j} \circ C_{i_j} \right), \tag{1}$$

where $\mathbf{i_m}$ is the $m$-tuple $(i_1, \ldots, i_m)$ and $i_{m+1}$ is uniquely
determined by $\mathbf{i_m}$. Next, measure the probability
that the initial state is not changed by the sequence,
$\mathrm{Tr}[E_\psi \mathcal{S}_{\mathbf{i_m}}(\rho_\psi)]$, which we call the "survival probability".
Here $\rho_\psi$ is a quantum state that takes into account
state-preparation errors and $E_\psi$ is the positive operator valued
measure element that takes into account measurement errors.
In the ideal (noise-free) case the survival probability
will be 1 for each sequence. Averaging the survival probability
over the $K$ sequences gives the sequence fidelity
$F_{\mathrm{seq}}(m,\psi)$, and a fit of $F_{\mathrm{seq}}(m,\psi)$ to either the zeroth or
first order model,

$$F^{(0)}_{\mathrm{seq}}(m,\psi) = A_0 p^m + B_0,$$
$$F^{(1)}_{\mathrm{seq}}(m,\psi) = A_1 p^m + C_1 (m-1) p^{m-2} + B_1, \tag{2}$$

gives the depolarizing parameter $p$ (the average error rate
over all Clifford gates is given by $r = (d-1)(1-p)/d$, where
$d = 2^n$ is the dimension of the system).
The coefficients $A_{1(0)}$, $B_{1(0)}$, and $C_1$ absorb the state
preparation and measurement errors as well as the error
on the final gate.
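
The fit in Step 1 can be sketched in a few lines. The following is a minimal illustration, not the authors' fitting routine: it assumes the zeroth order model $A p^m + B$, scans candidate values of $p$ on a grid, and solves for $A$ and $B$ by ordinary linear regression at each candidate. The function names, the grid bounds, and the synthetic data (generated with $A = B = 0.5$ and $p = 0.993$) are all hypothetical choices made for the example.

```python
def fit_zeroth_order(lengths, fidelities, p_min=0.90, p_max=0.9999, steps=2000):
    """Fit F(m) = A * p**m + B by scanning p; A and B follow from linear regression."""
    n = len(lengths)
    y_mean = sum(fidelities) / n
    best = None
    for k in range(steps + 1):
        p = p_min + (p_max - p_min) * k / steps
        x = [p ** m for m in lengths]
        x_mean = sum(x) / n
        sxx = sum((xi - x_mean) ** 2 for xi in x)
        if sxx == 0.0:
            continue  # degenerate regressor, skip this candidate
        sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, fidelities))
        a = sxy / sxx
        b = y_mean - a * x_mean
        sse = sum((a * xi + b - yi) ** 2 for xi, yi in zip(x, fidelities))
        if best is None or sse < best[0]:
            best = (sse, p, a, b)
    _, p, a, b = best
    return p, a, b


def average_error_rate(p, d=2):
    """Average error rate r = (d - 1)(1 - p)/d over the Clifford group."""
    return (d - 1) * (1 - p) / d


# Synthetic, noise-free sequence fidelities (hypothetical values for illustration).
lengths = list(range(2, 97, 2))
data = [0.5 * 0.993 ** m + 0.5 for m in lengths]
p_fit, a_fit, b_fit = fit_zeroth_order(lengths, data)
r_fit = average_error_rate(p_fit)
```

With real data the grid scan would typically be replaced by a nonlinear least-squares routine, and the first order model adds the extra term $C_1 (m-1) p^{m-2}$ to the regression.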

Step 2: Choose $K$ sequences of Clifford elements where
the first Clifford $C_{i_1}$ in each sequence is chosen uniformly
at random from $\mathrm{Clif}_n$, the second is always chosen to be
$C$, and alternating between uniformly random Clifford
elements and deterministic $C$ up to the $m$th random gate
[see Fig. 1(b)]. The $(m+1)$th gate is chosen to be the
inverse of the composition of the first $m$ random gates and
$m$ interlaced $C$ gates (we adopt the convention of defining
the length of a sequence by the number of random
gates). The superoperator representing the sequence is

$$\mathcal{V}_{\mathbf{i_m}} = \Lambda_{i_{m+1}} \circ C_{i_{m+1}} \circ \left( \bigcirc_{j=1}^{m} \left[ C \circ \Lambda_C \circ \Lambda_{i_j} \circ C_{i_j} \right] \right). \tag{3}$$

For each of the $K$ sequences, measure the survival probability
$\mathrm{Tr}[E_\psi \mathcal{V}_{\mathbf{i_m}}(\rho_\psi)]$ and average over the $K$ random
sequences to find the new sequence fidelity $F_{\overline{\mathrm{seq}}}(m,\psi)$.
Fit $F_{\overline{\mathrm{seq}}}(m,\psi)$ to one of the new zeroth or first order
models to obtain the depolarizing parameter $p_{\bar{C}}$. The
expressions for these models are given by Eq. (2) where $p$
is replaced by the new depolarizing parameter $p_{\bar{C}}$.
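
As an illustration of how an interleaved sequence from Step 2 might be assembled for a single qubit, the sketch below generates the 24 single-qubit Clifford gates (up to global phase) from the Hadamard and phase gates, alternates random Cliffords with a fixed target gate, and appends the inverse. It is a toy under stated assumptions: the inverse is found by direct $2 \times 2$ matrix multiplication rather than by the Gottesman-Knill tableau method the text refers to, and all function names are hypothetical.

```python
import math
import random

S2 = 1 / math.sqrt(2)
I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
H = [[S2 + 0j, S2 + 0j], [S2 + 0j, -S2 + 0j]]
S = [[1 + 0j, 0j], [0j, 1j]]
X90 = [[S2 + 0j, -1j * S2], [-1j * S2, S2 + 0j]]  # exp(-i*pi/4 * sigma_x)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def phase_free_key(u):
    """Hashable label for a unitary that ignores its global phase."""
    pivot = next(x for row in u for x in row if abs(x) > 1e-9)
    phase = pivot / abs(pivot)
    return tuple((round((x / phase).real, 6) + 0.0, round((x / phase).imag, 6) + 0.0)
                 for row in u for x in row)


def single_qubit_cliffords():
    """Close {H, S} under multiplication, identifying gates equal up to phase."""
    group = {phase_free_key(I2): I2}
    frontier = [I2]
    while frontier:
        u = frontier.pop()
        for g in (H, S):
            v = matmul(g, u)
            key = phase_free_key(v)
            if key not in group:
                group[key] = v
                frontier.append(v)
    return list(group.values())


def interleaved_sequence(cliffords, target, m, rng):
    """m random Cliffords, each followed by the target gate, then the inverse gate."""
    gates, total = [], I2
    for _ in range(m):
        c = rng.choice(cliffords)
        gates += [c, target]
        total = matmul(target, matmul(c, total))
    gates.append(dagger(total))  # ideal sequence composes to the identity
    return gates


def compose(gates):
    total = I2
    for g in gates:
        total = matmul(g, total)
    return total


cliffords = single_qubit_cliffords()
sequence = interleaved_sequence(cliffords, X90, 5, random.Random(7))
```

Dropping the `target` entries from the loop gives the standard sequence of Step 1, so the same helper functions cover both protocols.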

Step 3: From the values obtained for $p$ (Step 1) and
$p_{\bar{C}}$ (Step 2), the gate error of $\Lambda_C$ (which is exactly given
by $r_C = 1 -$ average gate fidelity of $\Lambda_C$) is estimated by

$$r_C^{\mathrm{est}} = \frac{(d-1)\left(1 - p_{\bar{C}}/p\right)}{d}. \tag{4}$$

Here $E$ is the error in our estimate, so that $r_C$ lies in the range
$[r_C^{\mathrm{est}} - E,\ r_C^{\mathrm{est}} + E]$, and is given by (see
later for derivation)

$$E = \min \begin{cases} \dfrac{(d-1)\left[\,|p - p_{\bar{C}}/p| + (1-p)\right]}{d} \\[2ex] \dfrac{2(d^2-1)(1-p)}{p\,d^2} + \dfrac{4\sqrt{1-p}\,\sqrt{d^2-1}}{p} \end{cases} \tag{5}$$

One interpretation of $E$ is that it arises from imperfect
random gates. To see this, first note that in the limit of
perfect random gates, $p \to 1$, $r_C^{\mathrm{est}}$ goes to the standard
error for a depolarizing channel with strength $p_{\bar{C}}$ (equivalently
$r_C^{\mathrm{est}} \to r_C$), and $E$ goes to zero. In the more specific
case of $\Lambda$ being a Pauli channel, one can replace the second
possibility in Eq. (5) with $2(d^2-1)(1-p)/(p\,d^2)$, and
when $\Lambda$ is depolarizing, $E = 0$. We expect that in the
typical case $\Lambda$ will be close to a depolarizing channel and
these error bounds will be an overestimate.
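
Step 3 is simple arithmetic once $p$ and $p_{\bar{C}}$ are known. The sketch below evaluates the estimate of Eq. (4) and the general bound of Eq. (5) for a single qubit ($d = 2$), using the depolarizing parameters reported later in the text ($p = 0.993$, $p_{\bar{C}} = 0.989$). The function name and the clipping of the lower end of the range at zero are choices made for this example, not part of the protocol as stated.

```python
import math


def interleaved_rb_estimate(p, p_c, d=2):
    """Return (r_est, E): the gate-error estimate of Eq. (4) and the bound of Eq. (5)."""
    r_est = (d - 1) * (1 - p_c / p) / d
    bound_1 = (d - 1) * (abs(p - p_c / p) + (1 - p)) / d
    bound_2 = (2 * (d**2 - 1) * (1 - p) / (p * d**2)
               + 4 * math.sqrt(1 - p) * math.sqrt(d**2 - 1) / p)
    return r_est, min(bound_1, bound_2)


r_est, err = interleaved_rb_estimate(p=0.993, p_c=0.989)
lower = max(0.0, r_est - err)  # an error rate cannot be negative
upper = r_est + err
print(f"r_est = {r_est:.4f}, E = {err:.4f}, range = [{lower:.4f}, {upper:.4f}]")
```

For these inputs the first expression in Eq. (5) is the smaller of the two, so the bound is set by how far $p_{\bar{C}}/p$ sits from $p$ rather than by the diamond-norm term.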

Experimental implementation. — Using the new proto- 
col, we verified the performance of two single-qubit gates 
on a superconducting transmon qubit. The device is similar
to the one described in Ref. [ ], but we focus on just
a single qubit with $\omega_{01}/2\pi = 4.76093$ GHz, anharmonicity
of $(\omega_{12} - \omega_{01})/2\pi = -234$ MHz, and coherence times
of $T_1 = 8.1\ \mu$s and $T_2^{\mathrm{echo}} = 16\ \mu$s.

Single-qubit control was performed by means of shaped
microwave pulses applied to capacitively-coupled bias
lines that address individual qubits. We used Gaussian
shaped pulses with a derivative envelope applied to an
orthogonal quadrature to minimize errors due to higher
levels of the transmon [ ]. The Gaussian width was
$\sigma = 5$ ns and the pulses were truncated to have a total
duration of $4\sigma = 20$ ns. A pulse calibration procedure
was used which employed several sequences of repeated
pulses that amplify small rotation angle and phase errors.
A Levenberg-Marquardt search provided all calibrated
pulse parameters in a few minutes [ ].
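
To make the pulse shape concrete, here is a minimal sketch of a truncated Gaussian envelope with its time derivative on the orthogonal quadrature, using the stated width $\sigma = 5$ ns and total duration $4\sigma = 20$ ns. The sample spacing `dt`, the peak `amplitude`, and the derivative scaling `alpha` are hypothetical parameters introduced for illustration; in the experiment such quantities come out of the calibration procedure and are not given in the text.

```python
import math


def gaussian_derivative_envelope(sigma=5.0, duration=20.0, dt=0.5,
                                 amplitude=1.0, alpha=0.5):
    """Sample a truncated Gaussian (in-phase) and its scaled derivative (quadrature).

    Times are in ns. `alpha` scales the derivative component and is a
    placeholder value, not a calibrated one.
    """
    n = int(round(duration / dt)) + 1
    t0 = duration / 2
    times = [k * dt for k in range(n)]
    in_phase = [amplitude * math.exp(-((t - t0) ** 2) / (2 * sigma**2)) for t in times]
    # d/dt of the Gaussian is -(t - t0)/sigma^2 times the Gaussian itself.
    quadrature = [-alpha * (t - t0) / sigma**2 * g for t, g in zip(times, in_phase)]
    return times, in_phase, quadrature


times, in_phase, quadrature = gaussian_derivative_envelope()
```

Because the pulse is cut off at $\pm 2\sigma$, the in-phase envelope does not reach zero at the edges; how that residual step is handled is an implementation detail the text does not specify.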

To perform standard randomized benchmarking,
we chose a Clifford generating set of
$\mathcal{G} = \{\mathrm{Id}, X_{\pm\pi/2}, X_{\pi}, Y_{\pm\pi/2}, Y_{\pi}, Z_{\pm\pi/2}, Z_{\pi}\}$. The
$Z$-rotations were performed by a rotating frame definition
change which is implemented by changing the phase
of subsequent pulses. Each Clifford gate in a random
sequence is performed by a random choice from the set
of minimal length constructions of that gate. For the
generating set $\mathcal{G}$, a Clifford gate has an average length of
$\sim 1.6$ pulses. To find the average fidelity for sequences
of length $N$, we create 32 random sequences of $N+1$
Clifford gates and measure $\langle\sigma_z\rangle$ at the end of each, then
average the results. Figure 2(a) shows the measured average
fidelities (blue circles) versus sequence length. The
data fit well to the first model of Eq. (2) with $p = 0.993$,
corresponding to an estimated average error rate for
the entire Clifford group of $r = (1-p)/2 = 0.0035$,
which is very close to the expected error of 0.0025 from
decoherence.

Since the Clifford generating rotations in $\mathcal{G}$ can each be
implemented with just a single pulse, we expect a lower
error rate for such gates than the average over the entire
group. We verify this for $X_{\pi/2}$ and $Y_{\pi/2}$ gates with
interleaved benchmarking. Since basis changes swap $X$
and $Y$ rotations, we remove $Z$ rotations from $\mathcal{G}$ for
constructing random interleaved sequences. This increases
the average length of a Clifford gate to 1.875 pulses. The
resulting average fidelities for the $X_{\pi/2}$ and $Y_{\pi/2}$ interleaved
sequences are shown in Fig. 2(a) as orange triangles
and red diamonds, respectively. The fidelities are
lower than the standard RB results because of an effective
doubling in the number of pulses in the interleaved
sequences. Both sequences fit to a model with the same
$p_{\bar{C}} = 0.989$. By Eqs. (4) and (5), this gives an estimated
error rate for $X_{\pi/2}$ and $Y_{\pi/2}$ gates of $r_C = 0.002^{+0.005}_{-0.002}$.

To further test the robustness of the technique, we also 
intentionally introduce additional error on a target gate 
to test the sensitivity of the interleaved benchmarking 
protocol to calibration errors. These results are summa- 
rized in Table I. For the small set of calibration errors in- 
troduced, the model reliably tracks the anticipated pulse 
infidelity. 

We compare the interleaved RB result to the standard
method of measuring gate performance by performing
QPT. The process matrices for the $X_{\pi/2}$ and $Y_{\pi/2}$ gates
are shown in Fig. 2(b) in the Pauli basis of the Liouville
representation (also known as the Pauli transfer map, see
[30]). To extract these maps we employ maximum likelihood
estimation (MLE) to ensure that the maps are
physical (we require the maps to be completely positive,
but allow them to be non-trace-preserving because of potential
leakage out of the qubit space). The gate fidelities
extracted from these maps are $F = 0.990$ ($X_{\pi/2}$) and
0.993 ($Y_{\pi/2}$). We attribute the small increase in error seen
in QPT to preparation and measurement errors. Additionally,
the use of MLE leads to difficulties in assigning
error bars to the fidelities. Consequently, interleaved
benchmarking provides a more reliable estimate for the
performance of Clifford gates.

FIG. 2. Experimental implementation of interleaved RB.
(a) Measurement of average fidelity over 32 random sequences
each of lengths between 2 and 96. The blue data (circles)
show the result of the standard RB protocol, while red (triangles)
and orange (diamonds) data correspond to interleaved
sequences for $X_{\pi/2}$ and $Y_{\pi/2}$ gates, respectively. All data are
well described by the first model of Eq. (2), with $p = 0.993$
(standard RB), $p_{\bar{C}} = 0.989$ (interleaved $X_{\pi/2}$) and $p_{\bar{C}} = 0.989$
(interleaved $Y_{\pi/2}$). Error bars are the standard error of the
mean of each point. (b) Pauli transfer maps from process
tomography of the $X_{\pi/2}$ and $Y_{\pi/2}$ gates with corresponding
gate fidelities of 0.990 and 0.993, respectively.

Over-rotation $\epsilon$    $r_{\mathrm{th}}$    Extracted gate error
0.0                         0.000                0.002 ± 0.005
[ ]                         0.004                0.004 ± 0.004
[ ]                         0.016                0.022 ± 0.022

TABLE I. Gate errors extracted with interleaved RB for intentional
pulse miscalibration errors of $X_{\pi/2}$. The first column
is the applied over-rotation $\epsilon$ about the $X$ axis with predicted
$\Lambda_C = \exp[-i\epsilon\sigma_x/2]$, the second column is found using
$r_{\mathrm{th}} = 2(1 - \cos^2(\epsilon/2))/3$, and the third is the experimentally
extracted gate errors via interleaving.
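
The expression for $r_{\mathrm{th}}$ in the caption of Table I can be cross-checked against the general formula for the average fidelity of a unitary error, $F = (|\mathrm{Tr}\,U|^2 + d)/[d(d+1)]$, which is a standard result and not taken from this text. The sketch below assumes a pure over-rotation $U = \exp[-i\epsilon\sigma_x/2]$, for which $\mathrm{Tr}\,U = 2\cos(\epsilon/2)$; the function names are hypothetical.

```python
import math


def r_th_from_caption(eps):
    """Predicted gate error for over-rotation eps, as written in the Table I caption."""
    return 2 * (1 - math.cos(eps / 2) ** 2) / 3


def r_th_from_average_fidelity(eps, d=2):
    """Same quantity from F = (|Tr U|^2 + d) / (d (d + 1)) with U = exp(-i eps sigma_x / 2)."""
    trace_u = 2 * math.cos(eps / 2)
    fidelity = (trace_u**2 + d) / (d * (d + 1))
    return 1 - fidelity


def over_rotation_for_error(r_th):
    """Invert r_th = (2/3) sin^2(eps/2) for eps (in radians), valid for r_th <= 2/3."""
    return 2 * math.asin(math.sqrt(1.5 * r_th))


eps_example = 0.3  # radians, arbitrary test value
difference = abs(r_th_from_caption(eps_example) - r_th_from_average_fidelity(eps_example))
```

The two routes agree identically because $1 - (4\cos^2(\epsilon/2) + 2)/6 = 2(1 - \cos^2(\epsilon/2))/3$.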

Derivation of the fitting models, gate errors and 
bounds. — The main idea behind the derivation of the 
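fitting models is the following "unitary 2-design" property
of the Clifford group: If $\Lambda$ is a quantum channel and
$\mathrm{Clif}_n = \{C_j\}$ then the "twirl" of $\Lambda$, $W(\Lambda)$, defined by

$$W(\Lambda) := \frac{1}{|\mathrm{Clif}_n|} \sum_{j=1}^{|\mathrm{Clif}_n|} C_j^{\dagger} \circ \Lambda \circ C_j \tag{6}$$

is the unique depolarizing channel $\Lambda_d$ with the same average
fidelity as $\Lambda$ [ ]. The average fidelity of $\Lambda$ is given by

$$F_{\Lambda,\mathcal{I}} := \int d\phi\ \mathrm{tr}\left( |\phi\rangle\langle\phi|\, \Lambda(|\phi\rangle\langle\phi|) \right) \tag{7}$$

which is just the average over all pure states $|\phi\rangle\langle\phi|$ of
the usual fidelity between the output state $\Lambda(|\phi\rangle\langle\phi|)$
and input state $|\phi\rangle\langle\phi|$. Hence if $\Lambda_d$ is given by
$\Lambda_d(\rho) = p\rho + (1-p)\frac{\mathbb{1}}{d}$ then $F_{\Lambda,\mathcal{I}} = p + \frac{1-p}{d}$.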
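
The twirling property can be checked numerically for one qubit. The sketch below relies on two standard facts that are assumptions relative to this text: a unital, trace-preserving qubit channel acts on the Bloch vector as a real $3 \times 3$ matrix $R$, and the single-qubit Clifford group acts on the Bloch vector as the 24 proper rotations of the cube (signed permutation matrices with determinant $+1$). Averaging $G^{T} R\, G$ over those rotations should give $p\,\mathbb{1}$ with $p = \mathrm{tr}(R)/3$, and the average fidelity $1/2 + \mathrm{tr}(R)/6$ should be unchanged. The example matrix is arbitrary and has not been checked for complete positivity.

```python
from itertools import permutations, product


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def cube_rotations():
    """All signed 3x3 permutation matrices with determinant +1 (24 of them)."""
    rotations = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            m = [[signs[i] if perm[i] == j else 0 for j in range(3)] for i in range(3)]
            if det3(m) == 1:
                rotations.append(m)
    return rotations


def twirl(r, rotations):
    """Average G^T R G over the given rotations."""
    out = [[0.0] * 3 for _ in range(3)]
    for g in rotations:
        for i in range(3):
            for j in range(3):
                out[i][j] += sum(g[k][i] * r[k][l] * g[l][j]
                                 for k in range(3) for l in range(3))
    return [[x / len(rotations) for x in row] for row in out]


def average_fidelity(r):
    """Average fidelity of a unital qubit channel with Bloch-vector matrix r."""
    return 0.5 + (r[0][0] + r[1][1] + r[2][2]) / 6


rotations = cube_rotations()
noise = [[0.98, 0.01, 0.00],
         [-0.02, 0.97, 0.03],
         [0.00, -0.01, 0.99]]  # illustrative values only
twirled = twirl(noise, rotations)
```

Here `twirled` comes out proportional to the identity with the same trace as `noise`, which is the single-qubit content of Eq. (6).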

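We now provide a brief overview of the derivation of
the fitting models. Defining $\mathcal{D}_{i_j} = C_{i_j} \circ \bigcirc_{s=1}^{j-1} \left[ C \circ C_{i_s} \right]$
allows us to write the interleaving sequence as

$$\mathcal{V}_{\mathbf{i_m}} = \Lambda_{i_{m+1}} \circ \left( \bigcirc_{j=1}^{m} \left[ \mathcal{D}_{i_j}^{\dagger} \circ \Lambda_C \circ \Lambda_{i_j} \circ \mathcal{D}_{i_j} \right] \right). \tag{8}$$

The zeroth order model corresponds to the noise being
independent of the gate, i.e. $\Lambda_{i_j} = \Lambda$ is independent of
$\mathcal{D}_{i_j}$ for every $j$. In this case when we average over many
sequences in Eq. (8) we obtain a composition of twirls
of $\Lambda_C \circ \Lambda$. Hence we obtain a composition of depolarizing
channels $\Lambda_{\bar{C},d} = (\Lambda_C \circ \Lambda)_d$ where for any state $\rho$,
$\Lambda_{\bar{C},d}(\rho) = p_{\bar{C}}\,\rho + (1 - p_{\bar{C}})\frac{\mathbb{1}}{d}$. Here, $1 - p_{\bar{C}}$ corresponds to
the depolarizing strength of $\Lambda_C \circ \Lambda$.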

The first order model corresponds to the case where
the noise depends on the gate. In this case, we apply a
perturbative argument similar in nature to that of Ref.
[11] (for more details see Ref. [ ]) to derive the fitting
model. Each $\Lambda_i$ is perturbed about the average
of all the $\Lambda_i$, denoted by $\Lambda$, with $\delta\Lambda_i := \Lambda_i - \Lambda$. Provided the average
variation of the $\|\delta\Lambda_i\|$, $\gamma := \frac{1}{|\mathrm{Clif}_n|} \sum_i \|\delta\Lambda_i\|$, is small (i.e.
$\gamma^2 \ll 2/[m(m+1)]$), the first order model is a valid
description of the fidelity decay curve. Note that the norm
$\|\cdot\|$ can be any norm satisfying certain general properties
(see Ref. [ ] for more detail). One usually chooses
the weakest norm satisfying these properties, which allows
for the largest class of gate-dependent errors. It is
important to emphasize that the type of the noise is irrelevant
for this sufficient condition: as long as the average
of the magnitudes is sufficiently small, the analysis can
be terminated at first order.

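We now outline how to obtain the expression for the
error given by Eq. (4) as well as the various expressions
for $E$ given by Eq. (5) and in the surrounding text. Let
us begin by looking at the difference in average fidelity
between $\Lambda_{\bar{C}} = \Lambda_C \circ \Lambda$ and $\Lambda_{\bar{C}}' := \Lambda_C \circ \Lambda_d$.
Since $\Lambda_d$ is depolarizing,

$$F_{\Lambda_{\bar{C}}',\mathcal{I}} = p\,F_{\Lambda_C,\mathcal{I}} + \frac{1-p}{d}. \tag{9}$$

Hence,

$$\frac{F_{\Lambda_{\bar{C}},\mathcal{I}}}{p} - \frac{1-p}{dp} - E \ \le\ F_{\Lambda_C,\mathcal{I}} \ \le\ \frac{F_{\Lambda_{\bar{C}},\mathcal{I}}}{p} - \frac{1-p}{dp} + E \tag{10}$$

where $E$ is an upper bound for $|F_{\Lambda_{\bar{C}},\mathcal{I}} - F_{\Lambda_{\bar{C}}',\mathcal{I}}|/p$. Using
$r_C := 1 - F_{\Lambda_C,\mathcal{I}}$ and $F_{\Lambda_{\bar{C}},\mathcal{I}} = p_{\bar{C}} + \frac{1-p_{\bar{C}}}{d}$ we find Eq. (4).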
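The first expression in Eq. (5) can be obtained by
noting that $|F_{\Lambda_{\bar{C}},\mathcal{I}} - F_{\Lambda_{\bar{C}}',\mathcal{I}}|$ can be upper bounded by

$$\left| F_{\Lambda_{\bar{C}},\mathcal{I}} - p\,F_{\Lambda,\mathcal{I}} - \frac{1-p}{d} \right| + p\left| F_{\Lambda,\mathcal{I}} - F_{\Lambda_C,\mathcal{I}} \right| \ \le\ \frac{(d-1)\left[\,|p^2 - p_{\bar{C}}| + (p - p^2)\right]}{d} \tag{11}$$

where we have used $|F_{\Lambda,\mathcal{I}} - F_{\Lambda_C,\mathcal{I}}| \le \frac{(d-1)(1-p)}{d}$.
Dividing the right-hand side of Eq. (11) by $p$ gives the first expression in Eq. (5).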

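The second expression in Eq. (5) is obtained by first
noting that

$$|F_{\Lambda_{\bar{C}},\mathcal{I}} - F_{\Lambda_{\bar{C}}',\mathcal{I}}| \ \le\ \|\Lambda - \Lambda_d\|_{\diamond} \tag{12}$$

where $\|\cdot\|_{\diamond}$ is the "diamond norm" [ ]. By the triangle
inequality,

$$|F_{\Lambda_{\bar{C}},\mathcal{I}} - F_{\Lambda_{\bar{C}}',\mathcal{I}}| \ \le\ \|\Lambda - \mathcal{I}\|_{\diamond} + \frac{2(d^2-1)(1-p)}{d^2}, \tag{13}$$

where $\|\Lambda_d - \mathcal{I}\|_{\diamond} = \frac{2(d^2-1)(1-p)}{d^2}$. It can be shown
that for arbitrary $\Lambda$ [ ],

$$\|\Lambda - \mathcal{I}\|_{\diamond} \ \le\ 4\sqrt{1-p}\,\sqrt{d^2-1}, \tag{14}$$

which trivially gives the second expression in Eq. (5).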

In the case of $\Lambda$ being equal to a Pauli channel,
$\|\Lambda - \mathcal{I}\|_{\diamond} = 2(d^2-1)(1-p)/d^2$ always holds [ ]. Lastly, the
depolarizing case is obtained by noting that $\Lambda_{\bar{C}} = \Lambda_{\bar{C}}'$ and
as such $E = 0$ by definition.
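
The depolarizing case can be verified with a short numerical check. The sketch below assumes the average Clifford noise $\Lambda$ is exactly depolarizing with parameter $p$, so the composite $\Lambda_C \circ \Lambda$ has average fidelity $p\,F_{\Lambda_C,\mathcal{I}} + (1-p)/d$. Converting that fidelity to a depolarizing parameter and inserting it into Eq. (4) should return $1 - F_{\Lambda_C,\mathcal{I}}$ exactly. The numerical values and function names are hypothetical.

```python
def depolarizing_parameter(avg_fidelity, d):
    """Invert F = p + (1 - p)/d for the depolarizing parameter p."""
    return (d * avg_fidelity - 1) / (d - 1)


def gate_error_estimate(p, p_c, d):
    """Eq. (4): estimated gate error from the two fitted depolarizing parameters."""
    return (d - 1) * (1 - p_c / p) / d


d = 2
p = 0.993          # depolarizing parameter of the average Clifford noise (example value)
f_gate = 0.9975    # average fidelity of the target-gate noise (example value)

f_composite = p * f_gate + (1 - p) / d        # fidelity of Lambda_C composed with Lambda
p_c = depolarizing_parameter(f_composite, d)  # what the interleaved fit would return
r_est = gate_error_estimate(p, p_c, d)
true_error = 1 - f_gate
```

The agreement between `r_est` and `true_error` does not depend on the particular numbers chosen, which is the algebraic content of $E = 0$ for depolarizing $\Lambda$.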

Scalability and robustness to state-preparation and 
measurement errors. — The fact that our protocol is scal- 
able in the number of qubits and, except in highly unre- 
alistic cases, independent of SPME follows directly from 
the form of the protocol being similar in nature to the 
one given in Ref. [11]. Indeed, one can show using a sim- 
ilar argument that the time-complexity of the protocol 
presented here is bounded above by O (n^) . SPME af- 
fect the analysis only when they conspire to produce a 
constant fidelity decay curve so that one cannot obtain 
an estimate for p^. As was shown in Ref. [ ] , the situa- 
tions for which this occurs are highly unphysical and can 
effectively be ignored. 

Conclusion. — We have presented a scalable protocol 
for benchmarking individual quantum gates. We explic- 
itly derive various bounds for the error of the imperfect 
gate in terms of parameters that are output from the 
protocol. The gate error can be estimated exactly in the
limit of perfect gates or if the average of the error operators
over all gates is depolarizing, which we believe is
close to the typical case. The method is scalable in the
size of the quantum system and is independent of SPME.
We have applied this protocol to a superconducting qubit
and shown the gate errors for both an $X_{\pi/2}$ and a $Y_{\pi/2}$ rotation
to be $0.002^{+0.005}_{-0.002}$, which compares favorably with the
gate errors extracted by quantum process tomography
(0.010 and 0.007 respectively).